Method and system for microphone array input type speech recognition



(54) 

(57) A microphone array input type speech recognition scheme capable of realizing a high precision sound source position or direction estimation by a small amount of calculations, and thereby realizing a high precision speech recognition. A band-pass waveform, which is a waveform for each frequency bandwidth, is obtained from input signals of the microphone array, and a band-pass power of the sound source is directly obtained from the band-pass waveform. Then, the obtained band-pass power is used as the speech parameter. It is also possible to realize the sound source estimation and the band-pass power estimation at high precision while further reducing an amount of calculations, by utilizing a sound source position search processing in which a low resolution position estimation and a high resolution position estimation are combined.

Description

BACKGROUND OF THE INVENTION 
FIELD OF THE INVENTION 



The present invention relates to a microphone array input type speech recognition scheme in which speeches uttered by a user are inputted through a microphone array and recognized.

DESCRIPTION OF THE BACKGROUND ART 

[...] recognition performance. In particular, background noises and [...] a speech recognition [...] encountered inconvenience described above and [...] spatially [...]

[...] the speech waveform of the target sound source [...] analysis is carried out for the obtained [...]

[...] on the FFT (Fast Fourier Transform) [...]

[...] position while minutely changing a direction or position over a range in which the [...] of sound [...] so that a very large amount of calculations [...]

[...] waves in forms of spherical waves, it is going to estimate a position [...] consequently an enormous amount [...] of the sound waves, so that two- or three-dimensional scanning is necessary and consequently [...]

[...] difficult to reduce a required amount of calculations.
SUMMARY OF THE INVENTION

[...] extraction unit for extracting a speech parameter for speech [...] according to the sound source position or [...]

[...] by a plurality of microphones; analyzing an [...] waveform for each frequency bandwidth; [...]

Other features and advantages of the present invention will become apparent from the following description taken in conjunction with the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS 

Fig. 1 is a block diagram of a conventional microphone array input type speech recognition system.

Fig. 2 is a block diagram of a processing configuration for a conventionally proposed parametric method for [...]

Figs. 5A and 5B are diagrams showing a relationship between a sound [...]

Fig. 7 is a block diagram of an exemplary configuration for a speech recognition unit in the system of Fig. 3.
Fig. 8 is a block diagram of another exemplary configuration for a speech recognition unit in the system of Fig. 3.

[...] embodiment of the present invention.

Fig. 13 is a flow chart for the processing of the sound source position search unit of Fig. 12.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 

Referring now to Fig. 3 to Fig. 10, the first embodiment of a method and a system for microphone array input type [...]

[...] This speech recognition system of Fig. 1 comprises a speech input unit 1, a frequency analysis unit 2, a sound [...] the speech parameter extraction unit 4 with a recognition dictionary. [...]

[...] be changed depending on a width of the bandwidth.

The filter output y in this [...] obtained from the waveform [...]

First, by denoting the band-pass [...] J sample sequences: $x_{ik} = (x_{ik}(n-J+1), \ldots)$ [...] microphones, it is possible to obtain a vector given by the following equation (1).

$$x_k = (x_{1k}, x_{2k}, \ldots, x_{Nk}) \qquad (1)$$

Also, by arranging the filter [...]

$$w_k = (w_{11}, w_{12}, \ldots, w_{1J}, w_{21}, w_{22}, \ldots, w_{2J}, \ldots, w_{N1}, w_{N2}, \ldots, w_{NJ}) \qquad (2)$$

Using the above equations (1) and (2), the filter output y can be expressed as:

$$y = w_k^{*} x_k \qquad (3)$$

[...]

$$E[y^2] = E[w_k^{*} x_k x_k^{*} w_k] = w_k^{*} R_k w_k \qquad (4)$$

[...] microphone array for a target direction or position is to be maintained constant. These constraint conditions can be expressed as:

$$w_k^{*} A = g \qquad (5)$$

[...] column vectors. This matrix A can be expressed as:

$$A = [a_1, a_2, \ldots, a_L] \qquad (6)$$

and each direction control vector $a_m$ (m = 1, 2, ..., L) can be expressed as:

$$a_m = (1, a_2 e^{-j\omega_m \tau_2}, \ldots, a_N e^{-j\omega_m \tau_N}) \qquad (7)$$

[...] equation (8).

$$w_k = R_k^{-1} A (A^{*} R_k^{-1} A)^{-1} g \qquad (8)$$

Using this filter coefficient [...] bandwidth from the sound source θ can be calculated as the following equation (9).

$$P_k(\theta) = g^{*} (A^{*} R_k^{-1} A)^{-1} g \qquad (9)$$

In a case of the sound source position estimation, θ is taken as a vector for expressing the coordinates.
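The arriving power of equation (9) can be illustrated with a short numerical sketch. This is a minimal illustration and not the implementation of the embodiment: the function names `steering_vector` and `mv_power` are hypothetical, a single constraint (L = 1, g = 1) is assumed, and the correlation matrix is synthesized from one source plus white noise rather than measured.

```python
import numpy as np

def steering_vector(omega, delays, amplitudes):
    """Direction control vector in the form of equation (7):
    entries a_i * exp(-j * omega * tau_i), with tau_1 = 0 and a_1 = 1."""
    return np.asarray(amplitudes) * np.exp(-1j * omega * np.asarray(delays))

def mv_power(R, A, g):
    """Arriving power in the form of equation (9): g^* (A^* R^{-1} A)^{-1} g."""
    R_inv_A = np.linalg.solve(R, A)      # R^{-1} A without forming the inverse
    M = A.conj().T @ R_inv_A             # A^* R^{-1} A, an L x L matrix
    return float(np.real(g.conj() @ np.linalg.solve(M, g)))

# Assumed toy setup: N = 4 microphones, one source of power s in white noise of power n.
N, s, n = 4, 2.0, 0.4
a = steering_vector(2 * np.pi * 1000.0, np.arange(N) * 1e-4, np.ones(N))
R = s * np.outer(a, a.conj()) + n * np.eye(N)

# Looking in the true direction; analytically this equals s + n / N = 2.1.
P = mv_power(R, a[:, None], np.ones(1))
```

Scanning θ then amounts to recomputing the delays, and hence the matrix A, for each candidate direction or position and evaluating `mv_power` again with the same R.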

Next, with reference to Figs. 5A and 5B, a manner of obtaining the propagation time difference and the amplitude for each microphone [...]

First, as shown in Fig. [...] coordinates [...] are incident from [...] the first microphone is given by:

$\tau_i(\theta) = ((x_i - x_1)$ [...]

and the amplitude can be assumed as:

[...]

On the other hand, in a case of a point sound source, as shown in [...] θ is located at $(x_s, y_s)$, the propagation time difference $\tau_i$ and the amplitude $a_i$ can be given by:

$$\tau_i = \left( ((x_i - x_s)^2 + (y_i - y_s)^2)^{1/2} - ((x_1 - x_s)^2 + (y_1 - y_s)^2)^{1/2} \right) / c \qquad (12)$$

$$a_i = ((x_1 - x_s)^2 + (y_1 - y_s)^2)^{1/2} / ((x_i - x_s)^2 + (y_i - y_s)^2)^{1/2} \qquad (13)$$
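For the point sound source case, the delay and amplitude of each microphone relative to the first microphone follow from the source-to-microphone distances. The sketch below assumes a two-dimensional layout, reads the amplitude as the ratio of the first microphone's distance to the i-th microphone's distance, and uses a sound speed of 340 m/s; that value and the function name are assumptions of this example, not values given in the text.

```python
import numpy as np

SPEED_OF_SOUND = 340.0  # m/s; assumed value, not stated in the text

def point_source_delays_and_gains(mic_xy, src_xy, c=SPEED_OF_SOUND):
    """Return (tau, amp) for a point source at src_xy.

    tau[i]: propagation time difference of microphone i relative to microphone 1
    amp[i]: amplitude of microphone i relative to microphone 1 (distance ratio)
    """
    mic_xy = np.asarray(mic_xy, dtype=float)
    dist = np.linalg.norm(mic_xy - np.asarray(src_xy, dtype=float), axis=1)
    tau = (dist - dist[0]) / c   # difference of path lengths divided by sound speed
    amp = dist[0] / dist         # spherical spreading relative to microphone 1
    return tau, amp

# Two microphones on the x axis, source 4 m in front of the first one.
tau, amp = point_source_delays_and_gains([(0.0, 0.0), (3.0, 0.0)], (0.0, 4.0))
# Distances are 4 m and 5 m, so tau = [0, 1/340] s and amp = [1, 0.8].
```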



[...] $P_k(\theta)$ given by the above equation (9) becomes large when θ coincides with the arriving direction or the sound source position [...]

[...] search range. The increment value [...] may be changed to any appropriate value depending on factors such as a [...]

[...] bandwidth, and taking a sum for all the frequency bandwidths from k = 1 to k = M, that is,

$$P(\theta)_{\mathrm{total}} = \sum_{k=1}^{M} W_k P_k(\theta) \qquad (14)$$

and estimating the sound source from a peak on the distribution after this synthesizing processing (the total sound [...]

[...] such as a power source frequency may be set small so as to reduce the influence of [...]

The detection of the sound source is carried out according to a size of a peak in $P(\theta)_{\mathrm{total}}$ as described above, and [...] a single [...] as the sound source. Alternatively, [...] threshold [...] with reference to an average value of portions other than peak portions on the synthesized (total) sound [...], and all peaks above this threshold may be detected as the sound sources.

[...] assumed sound source position which is set to be [...] determined with reference to positions of a plurality of microphones, so that this arriving power distribution $P_k(\theta)$ will be referred to as the sound source position judgement information.

Next, the speech parameter extraction unit 4 can extract the power of the k-th frequency bandwidth of the sound source [...] power distribution $P_k(\theta)$ for each bandwidth, according to the sound source position [...] by the sound source position search unit 3. Consequently, by extracting the power for all [...] k = 1 to M, it is possible to obtain the band-pass power to be used as the speech parameter.

The band-pass power of the sound source obtained in this manner is sent from the speech parameter extraction unit 4 to the speech recognition unit 5, and used in the speech recognition processing.
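The synthesis of equation (14), the peak detection, and the extraction of the band-pass powers at the detected position can be sketched together as follows. The array layout (M bands by T candidate positions) and the function names are assumptions of this example, and only the single largest peak is taken as the sound source.

```python
import numpy as np

def synthesize(P, weights):
    """Weighted sum over bands, in the form of equation (14).

    P: array of shape (M, T), P[k, t] = arriving power of band k at candidate t
    weights: array of shape (M,), the weights W_k
    """
    return np.asarray(weights, dtype=float) @ np.asarray(P, dtype=float)

def locate_and_extract(P, weights):
    """Pick the largest peak of the total distribution and return
    (index of the detected position, band-pass powers at that position)."""
    total = synthesize(P, weights)
    idx = int(np.argmax(total))
    return idx, np.asarray(P, dtype=float)[:, idx]

# Toy distribution: M = 2 bands, T = 3 candidate positions.
P = [[1.0, 5.0, 2.0],
     [0.0, 3.0, 4.0]]
idx, params = locate_and_extract(P, [1.0, 1.0])
# The total is [1, 8, 6], so idx = 1 and params = [5, 3].
```

Setting a weight to a small value, as the text suggests for a band containing a known noise frequency, changes which peak wins: with weights `[0.0, 1.0]` the same data yields position index 2.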
As shown in Fig. 7, the speech recognition unit 5 comprises a speech power calculation unit 501, a speech detection unit 502, a pattern matching unit 503, and a recognition dictionary 504.

In this speech recognition unit 5, the speech power is calculated from the speech parameter (band-pass power) [...]

[...] source position estimation method with a high precision and a small amount of calculations.

Now the flow of processing as described above will be summarized with reference to the flow chart of Fig. [...]

First, at the start of the processing, the initial setting is made for factors such as whether a direction estimation [...]

[...] are A/D converted at the sampling frequency of [...] kHz for [...]

[...] each frequency bandwidth k (k = 1 to M), where M = 16 in this example (step S3). Here, the calculations for the band-[...]

Next, at the sound source position search unit 3, using the band-pass waveform data for N [...] the frequency analysis unit 2 at the step S3, the correlation matrix $R_k$ for each frequency bandwidth k is obtained (step S4). Here, as shown in Fig. 10, the calculation of the correlation matrix $R_k$ is realized by obtaining the correlation matrix [...] frame data of 256 samples (points) for example. [...]

In addition, at the step S4, using this correlation matrix $R_k$, the arriving power distribution $P_k(\theta) = g^{*} (A^{*} R_k^{-1} A)^{-1} g$ is obtained as the sound source position judgement information for each assumed position or direction. This calculation is carried out over the entire space to be searched through, so as to obtain the spatial distribution of the arriving powers. As for the bandwidths, when M = 16, the calculation is carried out from k = 1 to k = 16.

Next, at the sound source position search unit 3, the arriving power distributions $P_k(\theta)$ for different frequency bandwidths are summed over the entire frequency bandwidths, for each θ, so as to obtain the total sound source power distribution $P(\theta)_{\mathrm{total}}$. Then, the largest peak is extracted from this $P(\theta)_{\mathrm{total}}$ and identified as the sound source position $\theta_0$ (step S5).

Next, at the speech parameter extraction unit 4, a value on the arriving power distribution (sound source position judgement information distribution) $P_k(\theta)$ for each frequency bandwidth obtained by the [...] 3 at the sound source position $\theta_0$ is extracted, and this is repeated for all the frequency bandwidths for each sound source so as to obtain the speech parameter $P_k(\theta_0)$ (step S6).

In addition, at the step S6, the powers for different bandwidths k of the speech parameter $P_k(\theta_0)$ are summed to obtain the power for the entire speech bandwidth at the speech power calculation unit 501 of the speech recognition unit 5.

Next, using the power for the entire speech bandwidth obtained at the step S6, the speech section is detected by the speech detection unit 502 of the speech recognition unit 5 (step S7).

Then, whether the end of the speech section is detected by the speech detection unit 502 or not is judged (step S8). If not, the processing returns to the step S2 to carry out the frequency analysis for the next waveform data frame.
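The correlation matrix computation of step S4 can be sketched as a time average over one frame of band-pass waveform data. The sketch assumes that, for each time n, the last J samples of every channel are stacked channel by channel as in equations (1) and (2), and that the expectation is replaced by an average over the frame; the function names and the random test data are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def stacked_snapshots(x, J):
    """x: array of shape (N, T), band-pass waveforms of one band for N channels.
    Returns an array of shape (T - J + 1, N * J) whose rows are the stacked
    vectors (x_1(n-J+1..n), x_2(n-J+1..n), ..., x_N(n-J+1..n))."""
    N, T = x.shape
    windows = sliding_window_view(x, J, axis=1)          # shape (N, T - J + 1, J)
    return windows.transpose(1, 0, 2).reshape(T - J + 1, N * J)

def correlation_matrix(x, J):
    """Frame average of x_k x_k^*, an (N*J) x (N*J) Hermitian matrix."""
    X = stacked_snapshots(x, J)
    return X.T @ X.conj() / X.shape[0]

# One frame of 256 samples for N = 4 channels with J = 3 taps (random stand-in data).
rng = np.random.default_rng(0)
frame = rng.standard_normal((4, 256))
R = correlation_matrix(frame, J=3)   # shape (12, 12)
```

In practice a small diagonal loading term is often added before inverting such a matrix; whether the embodiment does so is not stated in the text.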



The total speech power can [...]

Thereafter, the above process is repeated so as to carry out the speech parameter estimation and the speech [...]

[...] possible to use two direction control vectors $a_m(\theta_1)$ and $a_m(\theta_2)$ (m = 1, 2, ..., L) [...] equations (15) and (16).

$$a_m(\theta_1) = (1, a_2 e^{-j\omega_m \tau_2(\theta_1)}, \ldots, a_N e^{-j\omega_m \tau_N(\theta_1)}) \qquad (15)$$

$$a_m(\theta_2) = (1, a_2 e^{-j\omega_m \tau_2(\theta_2)}, \ldots, a_N e^{-j\omega_m \tau_N(\theta_2)}) \qquad (16)$$

Then, using these two direction control vectors $a_m(\theta_1)$ and $a_m(\theta_2)$, it is possible to set:

$$A = [a_1(\theta_1), a_2(\theta_1), \ldots, a_L(\theta_1), a_1(\theta_2), a_2(\theta_2), \ldots, a_L(\theta_2)] \qquad (17)$$

so as to make responses of the microphone array to [...] directions [...]

[...] configuration of the sound source position search unit 3 in this second embodiment for realizing the above described sound source position estimation processing.

In this configuration of Fig. 12, the sound source position search unit 3 comprises a low resolution sound source [...] the low resolution spectrum estimation. The high resolution sound source position search unit 302 densely estimates the arriving power distribution by using the high resolution spectrum estimation only in a vicinity of the position or direction obtained by the low resolution sound source position search unit 301.

Now the flow of processing in this sound source position search unit 3 in the configuration of Fig. 12 will be [...]

[...] calculated (step S11). Here, a method for obtaining this correlation matrix $R_k$ is the same as in the first embodiment.

[...] each [...]

Next, the low resolution arriving power distributions for different bandwidths are synthesized, and the sound [...]

[...] search is carried out. Here, the setting of the search range is set to be ±10° of the sound source position obtained at [...]. At this point, the equation to be used for the arriving power estimation (arriving power distribution) is the [...], and the increment value is set to a smaller value such as 1° for example (step S14).

Next, the high resolution arriving power distributions for different bandwidths obtained at the step S14 are synthesized and the sound source position $\theta_0'$ is obtained from a peak therein (step S15).

At the speech parameter extraction unit 4, the power (speech parameter) of the sound source is extracted from the [...] by the high resolution sound source position search at the high resolution sound source position search unit 302 of the sound source position search unit 3.

As described, in this second embodiment, by using the sound source position search processing in which the low resolution sound source position estimation and the high resolution sound source position estimation are combined, it [...] the sound source position and its band-pass power while reducing an amount of calculations [...]
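The two-stage search of this second embodiment can be sketched as a coarse scan followed by a dense scan around the coarse peak. The ±10° vicinity and the 1° fine increment are taken from the text; the 0° to 180° search range, the 10° coarse increment, and the two stand-in power functions are assumptions of this example. In the embodiment the two stages use different spectrum estimators, which are represented here only as two callables.

```python
import numpy as np

def coarse_to_fine_search(coarse_power, fine_power, lo=0.0, hi=180.0,
                          coarse_step=10.0, fine_halfwidth=10.0, fine_step=1.0):
    """Return (rough estimate, accurate estimate, number of power evaluations)."""
    # Stage 1: low resolution scan over the whole search range.
    coarse_grid = np.arange(lo, hi + coarse_step / 2, coarse_step)
    rough = coarse_grid[np.argmax([coarse_power(t) for t in coarse_grid])]
    # Stage 2: high resolution scan only in the vicinity of the rough estimate.
    fine_grid = np.arange(max(lo, rough - fine_halfwidth),
                          min(hi, rough + fine_halfwidth) + fine_step / 2,
                          fine_step)
    best = fine_grid[np.argmax([fine_power(t) for t in fine_grid])]
    return float(rough), float(best), len(coarse_grid) + len(fine_grid)

# Stand-in power functions with a true source direction of 37 degrees:
# a broad response for the low resolution stage, a sharp one for the high resolution stage.
rough, best, n_eval = coarse_to_fine_search(lambda t: -abs(t - 37.0),
                                            lambda t: -(t - 37.0) ** 2)
# rough = 40.0, best = 37.0, n_eval = 19 + 21 = 40 (a full 1-degree scan would need 181).
```

The saving depends on the low resolution response being broad enough that a source lying between coarse grid points still produces its largest value at a neighbouring grid point.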

As described, according to the present invention, a band-pass waveform which is a waveform for each frequency bandwidth is obtained from input signals of the microphone array, and a band-pass power of the sound source is directly obtained from the band-pass waveform, so that it is possible to realize a high precision sound source position or direction estimation by a small amount of calculations. Moreover, the obtained band-pass power can be used as the speech parameter so that it is possible to realize a high precision speech recognition.

In the speech recognition system of the present invention, the input signals of the microphone array entered by the [...] is a waveform for each frequency bandwidth. This band-pass waveform is obtained by using the band-pass filter bank [...] recognition system. Then, the band-pass power of the sound source is directly obtained from the obtained band-pass [...] collectively, a [...] configuration (filter function) having a plurality of delay line taps for each microphone channel is used and the sound source power is obtained as a sum of the filter outputs for all channels, while using the minimum variance method which is a known high precision spectrum [...]

[...] sound source power estimation processing using the minimum variance method is also used in the conventionally proposed parametric method described above, but a use of only one delay line tap has been assumed conventionally so that it has been impossible to obtain the bandwidth power collectively.

In contrast, in the speech recognition system of the present invention, a filter configuration with a plurality of delay line taps is used so that the power in each direction or position is obtained for each frequency bandwidth necessary for [...] than obtaining the power in each direction or position for each [...] the obtained power can be directly used for the speech recognition while a required amount of calculations can be reduced.

For example, in a case of using the conventional FFT with 512 points, it has been necessary to estimate the power in each direction for each of 256 components, but in the present invention when a number of bands of the band-pass filter bank is set to 16 for example, it suffices to estimate the power in each direction for 16 times. In addition, this power (band-pass power) can be estimated at higher precision compared with the conventional case of using the [...]

It is also to be noted that, besides those already mentioned above, many modifications and variations of the above embodiments may be made without departing from the novel and advantageous features of the present invention. Accordingly, all such modifications and variations are intended to be included within the scope of the appended claims.



1. A microphone array input type speech recognition system, comprising:

a speech input unit for inputting speeches in a plurality of channels using a microphone array formed by a plurality of microphones;
a frequency analysis unit for analyzing an input speech of each channel inputted by the speech input unit and obtaining band-pass waveforms for each channel, each band-pass waveform being a waveform for each frequency bandwidth;
a sound source position search unit for calculating a band-pass power distribution for each frequency bandwidth from the band-pass waveforms for each frequency bandwidth obtained by the frequency analysis unit, synthesizing calculated band-pass power distributions for a plurality of frequency bandwidths, and estimating a sound source position or direction from a synthesized band-pass power distribution;
a speech parameter extraction unit for extracting a speech parameter for speech recognition from the band-pass power distribution for each frequency bandwidth calculated by the sound source position search unit, according to the sound source position or direction estimated by the sound source position search unit; and
a speech recognition unit for obtaining a speech recognition result by matching the speech parameter extracted by the speech parameter extraction unit with a recognition dictionary.

2. The system of claim 1, wherein the sound source position search unit includes:

a low resolution sound source position estimation unit for estimating a rough sound source position or direction by minimizing an output power of the microphone array under constraints that responses of the microphone array for a plurality of directions or positions are to be maintained constant; and
a high resolution sound source position estimation unit for estimating an accurate sound source position or [...] accurate sound source position or direction.
3. The system of claim 1, wherein the frequency analysis unit obtains the band-pass waveforms for each channel by using a band-pass filter bank.

The system of claim 1, wherein the sound source position search unit calculates the [...] channel.

The system of claim 1, wherein the sound source position search unit calculates the band-pass power distribution [...] maintained constant.

The system of claim 1, wherein the speech parameter extraction unit extracts the band-pass power distribution for [...] respective weights, and summing weighted band-pass power distributions.

9. The system of claim 1, wherein the sound source position search unit estimates the sound source position or direction by detecting a peak in the synthesized band-pass power distribution and setting a position or direction corresponding to a detected peak as the sound source position or direction.

10. A microphone array input type speech analysis system, comprising:

a speech input unit for inputting speeches in a plurality of channels using a microphone array formed by a plurality of microphones;
a frequency analysis unit for analyzing an input speech of each channel inputted by the speech input unit and obtaining band-pass waveforms for each channel, each band-pass waveform being a waveform for each frequency bandwidth;
a sound source position search unit for calculating a band-pass power distribution for each frequency bandwidth from the band-pass waveforms for each frequency bandwidth obtained by the frequency analysis unit, [...] a sound source position or direction from a synthesized band-pass power distribution; and
a speech parameter extraction unit for extracting a speech parameter from the band-pass power distribution for each frequency bandwidth estimated by the sound source position search unit, according to the sound source position or direction estimated by the sound source position search unit.

11. The system of claim 10, wherein the sound source position search unit includes:

a low resolution sound source position estimation unit for estimating a rough sound source position or direction by minimizing an output power of the microphone array under constraints that responses of the microphone array for a plurality of directions or positions are to be maintained constant; and
[...] speech parameter extraction unit extracts the speech parameter according to the accurate sound source position or direction.

12. A microphone array input type speech analysis system, comprising:

a speech input unit for inputting speeches in a plurality of channels using a microphone array formed by a plurality of microphones;
[...] band-pass waveforms for each channel, each band-pass waveform being a waveform for each frequency bandwidth; and
a sound source position search unit for calculating a band-pass power distribution for each frequency bandwidth from the band-pass waveforms for each frequency bandwidth obtained by the frequency analysis unit, [...] a sound source position or direction from a synthesized band-pass power distribution.

13. The system of claim 12, wherein the sound source position search unit includes:

a low resolution sound source position estimation unit for estimating a rough sound source position or direction by minimizing an output power of the microphone array under constraints that responses of the microphone array for a plurality of directions or positions are to be maintained constant; and
a high resolution sound source position estimation unit for estimating an accurate sound source position or direction in a vicinity of the rough sound source position or direction estimated by the low resolution sound source position estimation unit, by minimizing the output power of the microphone array under constraints that a response of the microphone array for one direction or position is to be maintained constant.

14. A microphone array input type speech recognition method, comprising the steps of:

inputting speeches in a plurality of channels using a microphone array [...];
analyzing an input speech of each channel inputted by the inputting step and obtaining band-pass waveforms for each channel, each band-pass waveform being a waveform for each frequency bandwidth;
calculating a band-pass power distribution for each frequency bandwidth from the band-pass waveforms for each frequency bandwidth obtained by the analyzing step, synthesizing calculated band-pass power distributions for a plurality of frequency bandwidths, and estimating a sound source position or direction from a synthesized band-pass power distribution;
[...] frequency bandwidth calculated by the calculating step, according to the sound source position or direction [...]; and
[...] a recognition dictionary.

15. The method of claim 14, wherein the calculating step includes the steps of:

a low resolution sound source position estimation step for estimating a rough sound source position or direction by minimizing an output power of the microphone array under constraints that responses of the microphone array for a plurality of directions or positions are to be maintained constant; and
a high resolution sound source position estimation step for estimating an accurate sound source position or direction in a vicinity of the rough sound source position or direction estimated by the low resolution sound source position estimation step, by minimizing the output power of the microphone array under constraints that a response of the microphone array for one direction or position is to be maintained constant, wherein the extracting step extracts the speech parameter for speech recognition according to the accurate sound source position or direction.

16. The method of claim 14, wherein the analyzing step obtains the band-pass waveforms for each channel by using a band-pass filter bank.

17. The method of claim 14, wherein the calculating step calculates the band-pass power distribution for each frequency bandwidth, by calculating a band-pass power for each frequency bandwidth, in each one of a plurality of assumed sound source positions or directions within a prescribed search range.

18. The method of claim 14, wherein the calculating step calculates the band-pass power distribution for each frequency bandwidth by using a filter function configuration having a plurality of delay line taps for each channel.

19. The method of claim 14, wherein the calculating step calculates the band-pass power distribution for each frequency bandwidth by using a minimum variance method for minimizing an output power of the microphone array under constraints [...]

20. The method of claim 14, wherein the extracting step extracts the band-pass power distribution for each frequency bandwidth calculated by the calculating step for the sound source position or direction estimated by the calculating step directly as the speech parameter.

21. The method of claim 14, wherein the calculating step synthesizes the calculated band-pass power distributions for a plurality of frequency bandwidths by weighting the calculated band-pass power distributions with respective weights, and summing weighted band-pass power distributions.

22. The method of claim 14, wherein the calculating step estimates the sound source position or direction by detecting a peak in the synthesized band-pass power distribution and setting a position or direction corresponding to a detected peak as the sound source position or direction.

23. A microphone array input type speech analysis method, comprising the steps of:

inputting speeches in a plurality of channels using a microphone array formed by a plurality of microphones;
analyzing an input speech of each channel inputted by the inputting step, and obtaining band-pass waveforms for each channel, each band-pass waveform being a waveform for each frequency bandwidth;
calculating a band-pass power distribution for each frequency bandwidth from the band-pass waveforms for each frequency bandwidth obtained by the analyzing step, synthesizing calculated band-pass power distributions for a plurality of frequency bandwidths, and estimating a sound source position or direction from a synthesized band-pass power distribution; and
[...] by the calculating step, according to the sound source position or direction estimated by the calculating step.

24. The method of claim 23, wherein the calculating step includes the steps of:

a low resolution sound source position estimation step for estimating a rough sound source position or direction by minimizing an output power of the microphone array under constraints that responses of the microphone array for a plurality of directions or positions are to be maintained constant; and
a high resolution sound source position estimation step for estimating an accurate sound source position or direction in a vicinity of the rough sound source position or direction estimated by the low resolution sound source position estimation step, by minimizing the output power of the microphone array under constraints that a response of the microphone array for one direction or position is to be maintained constant, wherein the extracting step extracts the speech parameter according to the accurate sound source position or direction.

25. A microphone array input type speech analysis method, comprising the steps of:

inputting speeches in a plurality of channels using a microphone array formed by a plurality of microphones;
analyzing an input speech of each channel inputted by the inputting step, and obtaining band-pass waveforms for each channel, each band-pass waveform being a waveform for each frequency bandwidth; and
calculating a band-pass power distribution for each frequency bandwidth from the band-pass waveforms for [...] obtained by the analyzing step, synthesizing calculated band-pass power distributions for a plurality of frequency bandwidths, and estimating a sound source position or direction from a synthesized band-pass power distribution.

26. The method of claim 25, wherein the calculating step includes the steps of:

a low resolution sound source position estimation step for estimating a rough sound source position or direction by minimizing an output power of the microphone array under constraints that responses of the microphone array for a plurality of directions or positions are to be maintained constant; and
a high resolution sound source position estimation step for estimating an accurate sound source position or direction in a vicinity of the rough sound source position or direction estimated by the low resolution sound [...] a response of the microphone array for one direction or position is to be maintained constant.
FIG. 5A